Using Morphology And Syntax Together In Unsupervised Learning

Authors

  • Yu Hu
  • Irina Matveeva
  • John Goldsmith
  • Colin Sprague

Abstract

Unsupervised learning of grammar is important in many areas, ranging from text preprocessing for information retrieval and classification to machine translation. We describe a minimum description length (MDL) based grammar of a language that contains morphology and lexical categories. We use an unsupervised learner of morphology to bootstrap the acquisition of lexical categories, and we apply these two learning processes iteratively so that they help and constrain each other. To do so, we need to make our existing morphological analysis less fine-grained. We present an algorithm for collapsing morphological classes (signatures) by using syntactic context. Our experiments demonstrate that this collapse preserves the relation between morphology and lexical categories within the new signatures, and thereby minimizes the description length of the model.
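
The idea of collapsing signatures by syntactic context can be illustrated with a small sketch. The abstract does not say how contexts are compared, so everything here is an assumption made for illustration: a context is taken to be the immediate left and right neighbour of a word, similarity is plain cosine similarity, merging is greedy, and the names `context_vector`, `cosine`, `collapse_signatures`, and the `threshold` parameter are hypothetical. The sketch does not compute description length; it only shows how signatures with similar contexts could be grouped.

```python
from collections import Counter
from math import sqrt


def context_vector(words, corpus):
    """Count left-neighbour and right-neighbour features for the given word forms."""
    vec = Counter()
    padded = ["<s>"] + list(corpus) + ["</s>"]
    for i in range(1, len(padded) - 1):
        if padded[i] in words:
            vec[("L", padded[i - 1])] += 1
            vec[("R", padded[i + 1])] += 1
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def collapse_signatures(signatures, corpus, threshold=0.5):
    """Greedily merge signatures whose context vectors are similar.

    signatures: dict mapping a signature name to the set of word forms it covers.
    Returns a list of groups, each a sorted list of signature names.
    Assumption: cosine similarity and a fixed threshold stand in for
    whatever criterion the paper actually uses.
    """
    vectors = {name: context_vector(words, corpus) for name, words in signatures.items()}
    groups = []  # each entry: (list of names, combined context vector)
    for name in sorted(signatures):
        for names, combined in groups:
            if cosine(vectors[name], combined) >= threshold:
                names.append(name)
                combined.update(vectors[name])  # add counts into the group vector
                break
        else:
            groups.append(([name], Counter(vectors[name])))
    return [sorted(names) for names, _ in groups]


if __name__ == "__main__":
    toy_corpus = ("the dog sleeps the cats sleep we walk home "
                  "they walked home the dogs sleep the cat sleeps").split()
    toy_signatures = {
        "NULL.s": {"dog", "dogs"},
        "NULL.s.'s": {"cat", "cats", "cat's"},
        "NULL.ed": {"walk", "walked"},
    }
    print(collapse_signatures(toy_signatures, toy_corpus))
```

On the toy data in the demo, the two noun-like signatures share their contexts and fall into one group, while the verb-like signature stays separate. A real system would need richer context features and would accept a merge only if it lowers the description length of the model.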

Similar resources

Unsupervised Learning of Morphology by using Syntactic Categories

This paper presents a method for unsupervised learning of morphology that exploits the syntactic categories of words. Previous research [4][12] on learning of morphology and syntax has shown that both kinds of knowledge affect each other making it possible to use one type of knowledge to help the other. In this work, we make use of syntactic information i.e. Part-of-Speech (PoS) tags of words t...

Biologically-Motivated Machine Learning of Natural Language and Ontology A Computational Cognitive Model

The individual cognitive science disciplines all have contributions to make to the understanding and modelling of human learning. Our previous research has explored unsupervised learning of phonology, morphology and low-level syntax, as well as basic noun, verb and preposition ontology and semantics, plus musical and speech prosody. Successful applications using a mix of supervised and unsuperv...

Modeling Acquisition of Word Structure with Lexicalized Grammar Learning

This paper introduces a framework for learning structure in natural languages, and reports results from a simple application of it to learning word-syntax of an agglutinative language in an unsupervised manner. Arguably, the learning environment of children acquiring languages provides more information, by means of linguistic interaction and extralinguistic information present in th...

Efficient, Correct, Unsupervised Learning for Context-Sensitive Languages

A central problem for NLP is grammar induction: the development of unsupervised learning algorithms for syntax. In this paper we present a lattice-theoretic representation for natural language syntax, called Distributional Lattice Grammars. These representations are objective or empiricist, based on a generalisation of distributional learning, and are capable of representing all regular languag...

Publication date: 2005